26 - Deep Learning - Architectures Part 1

Welcome back to Deep Learning and you can see I have a couple of upgrades.

So we have a much better recording quality now, and I hope you can also notice that I finally fixed the sound problem, so you should be able to hear me much better right now.

And we are back to a new session where we want to talk about a couple of exciting topics.

So let's see what I've got for you.

So today I want to start discussing different architectures, and in particular, in the first couple of videos, I want to talk a bit about the early architectures, the things that we've seen in the very early days of deep learning. We will follow them by looking into deeper models in later videos, and in the end we want to talk about learning architectures. Whereas humans might need just dozens of examples, these systems will need millions.

A lot of what we'll see in the next couple of slides and videos has of course been developed for image recognition and object detection tasks, and in particular two datasets are very important for these kinds of tasks. The first is the ImageNet dataset, which you find in reference 11. It has more than 14 million natural images of varying size, and subsets of it, such as the well-known 1000-class subset, have been used for the ImageNet Large Scale Visual Recognition Challenges (ILSVRC). A lot of these images have actually been downloaded from the internet.
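Since the lecture doesn't prescribe a framework, here is a minimal sketch of how such a collection of varying-size natural images could be loaded, assuming PyTorch/torchvision and a hypothetical ImageNet-style directory layout (one subfolder per class; the path is a placeholder):

```python
# Sketch: ImageNet-style data comes in varying sizes, so we resize/crop
# before batching. Assumes torchvision; the dataset path is hypothetical.
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize(256),      # images come in varying sizes
    transforms.CenterCrop(224),  # fixed input size for the network
    transforms.ToTensor(),
])

dataset = torchvision.datasets.ImageFolder("/path/to/imagenet/train",
                                           transform=transform)
print(len(dataset.classes))  # e.g., 1000 for the ILSVRC subset
```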

There are also smaller datasets if you don't want to train with millions of images right away. Very important here are the CIFAR datasets, CIFAR-10 and CIFAR-100, which have 10 and 100 classes respectively, and there we only have 50,000 training and 10,000 test images. The images have a reduced size of 32 × 32 pixels, in order to be able to explore the pros and cons of different architectures very quickly.

And if you have these smaller datasets, then training also doesn't take so long. So this is also a very common dataset if you want to evaluate your architecture.
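As a concrete illustration, here is a minimal sketch of loading CIFAR-10, again assuming torchvision (an assumption, as the lecture names no framework); the 50,000/10,000 split can be verified directly:

```python
# Sketch: loading CIFAR-10 with torchvision and wrapping it in DataLoaders.
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

transform = transforms.ToTensor()  # 32x32 RGB images -> tensors in [0, 1]

# 50k training and 10k test images, 10 classes
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=transform)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False)

print(len(train_set), len(test_set))  # 50000 10000
```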

Okay, so based on these different datasets, we then want to go ahead and look into the early architectures, and I think one of the most important ones is LeNet, which was published in 1998 in reference 9. You can see this is essentially the convolutional neural network as we have been discussing so far.

It has been used, for example, for character recognition. We have the convolutional layers with trainable kernels, then pooling, another set of convolutional layers, and another pooling operation. Towards the end we go into fully connected layers, where we gradually reduce the dimensionality, and at the very end we have the output layer that corresponds to the number of classes.
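To make the layer sequence concrete, here is a minimal LeNet-style sketch in PyTorch. The layer sizes follow the classic LeNet-5 layout, but this is only an illustration of the structure described above, not the exact 1998 model (which, for instance, also used sparse S2-C3 connectivity):

```python
# Sketch: LeNet-style CNN, i.e. conv -> pool -> conv -> pool -> MLP head.
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # C1: 32x32 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                  # S2: 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),  # C3: 14x14 -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                  # S4: 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),  # C5
            nn.Tanh(),
            nn.Linear(120, 84),          # F6
            nn.Tanh(),
            nn.Linear(84, num_classes),  # output layer
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNet()
out = model(torch.randn(1, 1, 32, 32))  # one 32x32 grayscale image
print(out.shape)  # torch.Size([1, 10])
```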

So this is a very typical CNN-type architecture, and this kind of approach has been used in many papers and has inspired a lot of work.

For every architecture here we list key features, and you can see that most of the bullets are in grey. That means that most of these features did not survive. What survived, of course, is convolution for spatial features; this is the main idea that is still prevalent. The other things did not: sub-sampling using average pooling and the non-linearity tanh (the tangens hyperbolicus) are rarely used today, and it is not a very deep model. It also had a sparse connectivity between the S2 and C3 layers, as you see here in the figure, which is also not that common anymore. The multi-layer perceptron as the final classifier is likewise something that we no longer see, because it has been replaced by, for example, fully convolutional networks, which is a much better approach. Also, the sequence of convolution, pooling, and non-linearity is kind of fixed, and today we would do that in a much better way.
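To illustrate why the multi-layer perceptron head did not survive, here is a small sketch, assuming PyTorch, of a fully convolutional head: a 1 × 1 convolution followed by global average pooling. Unlike a flattened MLP head, which is tied to one fixed spatial size, it accepts arbitrary feature-map sizes:

```python
# Sketch: a fully convolutional classification head. The 1x1 convolution
# produces per-location class scores; global average pooling collapses
# them to one score per class, regardless of feature-map size.
import torch
import torch.nn as nn

fcn_head = nn.Sequential(
    nn.Conv2d(16, 10, kernel_size=1),  # per-location class scores
    nn.AdaptiveAvgPool2d(1),           # global average pooling
    nn.Flatten(),
)

for size in (5, 9):  # works for different feature-map sizes
    scores = fcn_head(torch.randn(1, 16, size, size))
    print(scores.shape)  # torch.Size([1, 10]) in both cases
```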

But of course this architecture is fundamental for many of the further developments and I think

it's really important that we are also listing it here. The next milestone that I want to talk

about in this video is AlexNet. Here you find the typical image; by the way, you will find exactly this image in the original publication. AlexNet consists of the two branches that you see here, and you can see that even in the original publication the top branch is partially cut off. So this is an artifact that you find in many depictions of AlexNet whenever they refer to this figure. The figure is cut, but it's not that severe, because the two branches are essentially identical. One of the reasons why the network was split into two sub-networks, you could say, is that AlexNet has been implemented on graphics processing units. So this is implemented on GPUs, and it was actually already multi-GPU: the two branches that you see at the top and bottom have each been placed on their own GPU.
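This two-branch split corresponds to what is nowadays called a grouped convolution: each half of the channels is processed by its own filter bank, so each half could live on one GPU. A minimal sketch, assuming PyTorch, with channel counts as in AlexNet's second convolutional layer:

```python
# Sketch: grouped convolution as in AlexNet's two-branch layout.
# With groups=2, the 96 input channels are split into two halves of 48,
# each feeding its own set of 128 filters (256 output channels in total).
import torch
import torch.nn as nn

grouped = nn.Conv2d(96, 256, kernel_size=5, padding=2, groups=2)

x = torch.randn(1, 96, 27, 27)
print(grouped(x).shape)  # torch.Size([1, 256, 27, 27])
```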


This video discusses the first early architectures developed in deep learning, from LeNet to GoogLeNet.

